To find out what happens when words are used in a context where their potential for vagueness comes to the fore, Experiment 1 departed from pilot experiment B in three ways: it used three arrays rather than two, so that the vague description had more than one possible referent; it used indefinite articles in the vague instructions, to avoid the impression that only one response counted as correct; and it provided no error feedback.
An indication that the potential for vagueness was realised in Experiment 1 is that the borderline response was chosen fairly often: 16% of the time.
In Experiment 1, an item was a referring-expression instruction followed by a set of three dot arrays defined by a triple of numbers giving the number of dots in the left, middle, and right arrays. We used four triples: (6,15,24); (16,25,34); (26,35,44); (36,45,54). Each set comprised three arrays (instead of two as in pilot experiment B). The array representing the central number was always presented in the middle of the three; of the two flanking arrays, one had fewer dots than the central array and the other had more, and each flanker appeared equally often on the left and on the right of the central array.
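As a hypothetical sketch (our own code, not the authors'), the item sets with counterbalanced flanker sides can be generated from the four triples like this:

```r
# Sketch (names are ours): build the item sets from the four triples,
# with the smaller/larger flanker counterbalanced left vs right.
triples <- list(c(6, 15, 24), c(16, 25, 34), c(26, 35, 44), c(36, 45, 54))

make_item_sets <- function(triple) {
  small <- triple[1]; centre <- triple[2]; large <- triple[3]
  # Central array always in the middle; flankers swap sides across versions.
  rbind(
    data.frame(left = small, middle = centre, right = large),
    data.frame(left = large, middle = centre, right = small)
  )
}

item_sets <- do.call(rbind, lapply(triples, make_item_sets))
```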
Responses were classified as follows, using as an example the item (6,15,24) with an instruction that identified the smaller flanking array (6): 6 was classified as the expected response, 15 as the borderline response, and 24 as the extreme response.
In the “vague numerical” condition the instruction was “Choose a square with about 10 dots” – none of the arrays contained exactly 10 dots, but 10 is closer to 6 than it is to 15, making 6 a better response to that instruction, 15 a borderline response, and 24 an extreme response.
In the vague verbal condition we used “Choose a square with few dots”. We considered this to be equivalent in terms of which responses were expected (6), borderline (15) and extreme (24).
In the crisp numerical condition we used “Choose the square with 6 dots”. The smaller flanking array always contained exactly the specified number of dots. We considered this to be equivalent in terms of which responses were expected (6), borderline (15) and extreme (24).
For crisp verbal, we used “Choose the square with the fewest dots”. We considered this to be equivalent in terms of which responses were expected (6), borderline (15) and extreme (24).
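The classification scheme above can be expressed as a small helper function. This is an illustrative sketch under our own naming, not the authors' analysis code:

```r
# Hypothetical helper: classify a chosen array as the expected, borderline,
# or extreme response, given the item triple (ordered small < centre < large)
# and whether the instruction targets the smaller or larger flanker.
classify_response <- function(chosen, triple, quantity = c("Small", "Large")) {
  quantity <- match.arg(quantity)
  expected <- if (quantity == "Small") triple[1] else triple[3]
  extreme  <- if (quantity == "Small") triple[3] else triple[1]
  if (chosen == expected) "expected"
  else if (chosen == triple[2]) "borderline"
  else if (chosen == extreme) "extreme"
  else stop("chosen value is not in this item's triple")
}

classify_response(15, c(6, 15, 24), "Small")  # "borderline"
```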
On each trial, first the referring expression that constituted the instruction for that trial was displayed (e.g., “Choose a square with about 10 dots”). Participants then pressed a key to indicate that they had read the instruction.
The instruction remained on screen, and after 1000 ms the arrays appeared (see Figure below). Response time was measured from the presentation of the arrays until the keypress indicating the participant’s choice. The trial timed out after 60 seconds if there was no response.
In this experiment, no feedback was given. This was because, in the vague conditions, we did not regard any response as “correct” or “incorrect”, but rather as expected, borderline, or extreme, and we did not want to draw participants’ attention to this distinction explicitly. The participant’s choice on each trial was recorded for analysis.
| Item | Quantity | Number | Crisp | Vague |
|---|---|---|---|---|
| 06:15:24 | Small | Numeric | Choose the square with 6 dots | Choose a square with about 10 dots |
| 06:15:24 | Small | Verbal | Choose the square with the fewest dots | Choose a square with few dots |
| 06:15:24 | Large | Numeric | Choose the square with 24 dots | Choose a square with about 20 dots |
| 06:15:24 | Large | Verbal | Choose the square with the most dots | Choose a square with many dots |
| 16:25:34 | Small | Numeric | Choose the square with 16 dots | Choose a square with about 20 dots |
| 16:25:34 | Small | Verbal | Choose the square with the fewest dots | Choose a square with few dots |
| 16:25:34 | Large | Numeric | Choose the square with 34 dots | Choose a square with about 30 dots |
| 16:25:34 | Large | Verbal | Choose the square with the most dots | Choose a square with many dots |
| 26:35:44 | Small | Numeric | Choose the square with 26 dots | Choose a square with about 30 dots |
| 26:35:44 | Small | Verbal | Choose the square with the fewest dots | Choose a square with few dots |
| 26:35:44 | Large | Numeric | Choose the square with 44 dots | Choose a square with about 40 dots |
| 26:35:44 | Large | Verbal | Choose the square with the most dots | Choose a square with many dots |
| 36:45:54 | Small | Numeric | Choose the square with 36 dots | Choose a square with about 40 dots |
| 36:45:54 | Small | Verbal | Choose the square with the fewest dots | Choose a square with few dots |
| 36:45:54 | Large | Numeric | Choose the square with 54 dots | Choose a square with about 50 dots |
| 36:45:54 | Large | Verbal | Choose the square with the most dots | Choose a square with many dots |
We formulated the following hypotheses for Experiment 1:
Original full model
dat_model <- dat
dat_model$c_Vag <- ifelse(dat_model$Vagueness=="Crisp", -0.5, 0.5)
dat_model$c_Num <- ifelse(dat_model$Number=="Verbal", -0.5, 0.5)
dat_model$c_Itm <- ifelse(dat_model$Item=="06:15:24", -.75, ifelse(dat_model$Item=="16:25:34", -.25, ifelse(dat_model$Item=="26:35:44", .25, .75)))
rtFullModel <- lmerTest::lmer(RT_log ~ 1 + c_Vag * c_Num + c_Itm + (1 + c_Vag * c_Num + c_Itm | Subject), dat_model)
pretty_coef_table(rtFullModel, "rtFullModel")
| term | 𝛽 | s.e. | d.f. | 𝑡 | 𝑝 | Pr(>|t|) | sig. |
|---|---|---|---|---|---|---|---|
| c_Vag | 0.058 | 0.013 | 28.99 | 4.55 | 8.87e-05 | <0.001 | *** |
| c_Num | 0.365 | 0.072 | 29.00 | 5.08 | 2.03e-05 | <0.001 | *** |
| c_Itm | 0.120 | 0.017 | 28.98 | 7.11 | 8.12e-08 | <0.001 | *** |
| c_Vag:c_Num | 0.069 | 0.027 | 29.04 | 2.53 | 1.69e-02 | <0.05 | * |
Numeric only
dat_model <- droplevels(subset(dat, Number=="Numeric"))
dat_model$c_Vag <- ifelse(dat_model$Vagueness=="Crisp", -0.5, 0.5)
dat_model$c_Itm <- ifelse(dat_model$Item=="06:15:24", -.75, ifelse(dat_model$Item=="16:25:34", -.25, ifelse(dat_model$Item=="26:35:44", .25, .75)))
rtFullModel_num <- lmerTest::lmer(RT_log ~ 1 + c_Vag + c_Itm + (1 + c_Vag + c_Itm | Subject), dat_model)
pretty_coef_table(rtFullModel_num, "rtFullModel_num")
| term | 𝛽 | s.e. | d.f. | 𝑡 | 𝑝 | Pr(>|t|) | sig. |
|---|---|---|---|---|---|---|---|
| c_Vag | 0.093 | 0.021 | 28.77 | 4.51 | 9.99e-05 | <0.001 | *** |
| c_Itm | 0.064 | 0.024 | 28.79 | 2.66 | 1.27e-02 | <0.05 | * |
Verbal only
dat_model <- droplevels(subset(dat, Number=="Verbal"))
dat_model$c_Vag <- ifelse(dat_model$Vagueness=="Crisp", -0.5, 0.5)
dat_model$c_Itm <- ifelse(dat_model$Item=="06:15:24", -.75, ifelse(dat_model$Item=="16:25:34", -.25, ifelse(dat_model$Item=="26:35:44", .25, .75)))
rtFullModel_verb <- lmerTest::lmer(RT_log ~ 1 + c_Vag + c_Itm + (1 + c_Vag + c_Itm | Subject), dat_model)
pretty_coef_table(rtFullModel_verb, "rtFullModel_verb")
| term | 𝛽 | s.e. | d.f. | 𝑡 | 𝑝 | Pr(>|t|) | sig. |
|---|---|---|---|---|---|---|---|
| c_Vag | 0.024 | 0.016 | 29.03 | 1.46 | 1.55e-01 | 0.155 | |
| c_Itm | 0.175 | 0.016 | 28.88 | 11.04 | 7.02e-12 | <0.001 | *** |
Borderline model
dat_model <- dat_borderline
dat_model$c_Vag <- ifelse(dat_model$Vagueness=="Crisp", -0.5, 0.5)
dat_model$c_Num <- ifelse(dat_model$Number=="Verbal", -0.5, 0.5)
dat_model$c_Itm <- ifelse(dat_model$Item=="06:15:24", -.75, ifelse(dat_model$Item=="16:25:34", -.25, ifelse(dat_model$Item=="26:35:44", .25, .75)))
blFullModel <- lme4::glmer(isBorderline ~ c_Vag * c_Num + c_Itm + (1 + c_Vag * c_Num + c_Itm | Subject), dat_model, family = "binomial", control = lme4::glmerControl(optimizer = "bobyqa"))
pretty_coef_table(blFullModel, "blFullModel")
| term | 𝛽 | s.e. | 𝑧 | 𝑝 | Pr(>|z|) | sig. |
|---|---|---|---|---|---|---|
| c_Vag | 0.66 | 0.224 | 2.96 | 3.13e-03 | <0.01 | ** |
| c_Num | 3.33 | 0.226 | 14.71 | 5.87e-49 | <0.001 | *** |
| c_Itm | 0.38 | 0.098 | 3.84 | 1.21e-04 | <0.001 | *** |
| c_Vag:c_Num | 0.69 | 0.420 | 1.64 | 1.02e-01 | 0.102 | |
On the basis of the initial full model:
However, the plot shows that responses to the 6:15:24 item under “crisp numeric” instructions were extremely fast relative to responses to the same item under “vague numeric” instructions, so the effects in the model of the full dataset could be driven by this difference.
A clearer picture of the effects of interest might be obtained by removing the 6:15:24 level of Item from the data set, and fitting the model to this restricted data. Doing this results in the effects tabled below.
With less data available, the model formula had to be simplified in order to converge – specifically, the per-subject slopes for the Vagueness by Number interaction and for the effect of Item were dropped.
Model of the data after the 6:15:24 level of Item is removed
dat_model <- droplevels(subset(dat, Item!="06:15:24"))
dat_model$c_Vag <- ifelse(dat_model$Vagueness=="Crisp", -0.5, 0.5)
dat_model$c_Num <- ifelse(dat_model$Number=="Verbal", -0.5, 0.5)
dat_model$c_Itm <- ifelse(dat_model$Item=="16:25:34", -.3333, ifelse(dat_model$Item=="26:35:44", .0000, .3333))
rtRestrictedModel <- lmerTest::lmer(RT_log ~ 1 + c_Vag * c_Num + c_Itm + (1 + c_Vag + c_Num | Subject), dat_model)
pretty_coef_table(rtRestrictedModel, "rtRestrictedModel")
| term | 𝛽 | s.e. | d.f. | 𝑡 | 𝑝 | Pr(>|t|) | sig. |
|---|---|---|---|---|---|---|---|
| c_Vag | 0.015 | 0.013 | 186.49 | 1.18 | 2.40e-01 | 0.24 | |
| c_Num | 0.359 | 0.076 | 29.00 | 4.70 | 5.80e-05 | <0.001 | *** |
| c_Itm | 0.023 | 0.023 | 5493.08 | 0.99 | 3.23e-01 | 0.323 | |
| c_Vag:c_Num | -0.022 | 0.025 | 5493.11 | -0.89 | 3.74e-01 | 0.374 | |
Numeric only after the 6:15:24 level of Item is removed
dat_model <- droplevels(subset(dat, Item!="06:15:24" & Number=="Numeric"))
dat_model$c_Vag <- ifelse(dat_model$Vagueness=="Crisp", -0.5, 0.5)
dat_model$c_Itm <- ifelse(dat_model$Item=="16:25:34", -.3333, ifelse(dat_model$Item=="26:35:44", .0000, .3333))
rtRestrictedModel_num <- lmerTest::lmer(RT_log ~ 1 + c_Vag + c_Itm + (1 + c_Vag | Subject), dat_model)
pretty_coef_table(rtRestrictedModel_num, "rtRestrictedModel_num")
| term | 𝛽 | s.e. | d.f. | 𝑡 | 𝑝 | Pr(>|t|) | sig. |
|---|---|---|---|---|---|---|---|
| c_Vag | 0.0049 | 0.020 | 626.61 | 0.25 | 8.03e-01 | 0.803 | |
| c_Itm | -0.1644 | 0.036 | 2676.03 | -4.60 | 4.39e-06 | <0.001 | *** |
Verbal-only after the 6:15:24 level of Item is removed
dat_model <- droplevels(subset(dat, Item!="06:15:24" & Number=="Verbal"))
dat_model$c_Vag <- ifelse(dat_model$Vagueness=="Crisp", -0.5, 0.5)
dat_model$c_Itm <- ifelse(dat_model$Item=="16:25:34", -.3333, ifelse(dat_model$Item=="26:35:44", .0000, .3333))
rtRestrictedModel_verb <- lmerTest::lmer(RT_log ~ 1 + c_Vag + c_Itm + (1 + c_Vag + c_Itm | Subject), dat_model)
pretty_coef_table(rtRestrictedModel_verb, "rtRestrictedModel_verb")
| term | 𝛽 | s.e. | d.f. | 𝑡 | 𝑝 | Pr(>|t|) | sig. |
|---|---|---|---|---|---|---|---|
| c_Vag | 0.026 | 0.019 | 31.21 | 1.41 | 1.70e-01 | 0.17 | |
| c_Itm | 0.202 | 0.031 | 109.46 | 6.57 | 1.75e-09 | <0.001 | *** |
On the basis of the restricted model:
This experiment tested whether vague instructions would result in faster responses than crisp instructions, when borderline cases were present. Faster responses for vague instructions were found in pilot experiment B, but there were no borderline cases in that experiment.
In this experiment, by contrast, we found that vague instructions resulted in slower responses than crisp instructions: a difference (112 ms) that was significant in the full data, but no longer significant once the smallest arrays, whose pattern ran opposite to the main trends in the rest of the data, were removed from the analysis.
We also found a significant effect of instruction format, with numerical format slowing responses by 689 ms on average, so that the disadvantage of numerical format overwhelmed the contribution of vagueness. The verbal vague condition still yielded faster responses than the numerical crisp condition, reproducing the pattern from pilot experiment B; in the light of Experiment 1, however, where borderline cases were present, the advantage previously ascribed to vagueness now looks more like an advantage of verbal instruction format.
However, once again there is a possible confound. In Experiment 1, instruction format (i.e., the difference between numeric and verbal) went hand in hand with what might be called the (human) “selection algorithm”. To select the dot array that contains “few dots”, it suffices to compare the three arrays and pick the one that contains the fewest elements. To select the dot array that contains “16 dots”, by contrast, the participant seems to need to estimate, and then match, the cardinality of (at least) one dot array to 16, a process which could plausibly take longer, independently of vagueness. Our results so far therefore permit the interpretation that what made the instructions in the verbal condition fast is not the fact that they were worded verbally, but that they allowed participants to use comparison rather than having to resort to matching.
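The contrast between the two selection algorithms can be made concrete with a short sketch (our own gloss, not a cognitive model): comparison needs only the relative ordering of the arrays, whereas matching requires estimating each cardinality and relating it to a target number.

```r
# Comparison: "fewest dots" – only relative sizes matter, no target needed.
select_by_comparison <- function(counts) {
  which.min(counts)
}

# Matching: "16 dots" / "about 10 dots" – estimate each cardinality
# and pick the array closest to the target number.
select_by_matching <- function(counts, target) {
  which.min(abs(counts - target))
}

select_by_comparison(c(6, 15, 24))    # array 1
select_by_matching(c(6, 15, 24), 10)  # array 1 (10 is closer to 6 than to 15)
```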
In the next two experiments we pitted the comparison and matching selection algorithms against each other while controlling vagueness and instruction format. In Experiment 2 we restricted all instructions to numeric quantifiers while factorially manipulating vagueness and selection task; in Experiment 3 we ensured that all instructions used verbal quantifiers, again factorially manipulating vagueness and selection task. This allowed us to distinguish between the predictions of the selection-task account and the instruction-format account.